User-Assisted Archive Document Image Analysis for Digital Library Construction
Authors
Abstract
A configurable archive document image analysis system for digital library construction has been designed using rapid prototyping and top-down iterative development methods. This approach proved essential for capturing the curators’ expertise about existing card archive structures, content and databases. The design currently segments about 93% of the required archive card fields correctly overall, and 81.3% of the archive cards in a test set of 2000 images have all fields correctly segmented and labelled. Analysis of errors in the test set indicates that heavily annotated cards and non-standard card formats comprise 5-10% of the overall archive, and a significant proportion of these are unlikely to be resolvable without curatorial intervention.
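The abstract reports two different accuracy figures: one counted per field and one counted per card, where a card only counts if every field on it is correct. The following is a minimal sketch of how such figures could be computed, not the authors' evaluation code. The function names, the field names (`title`, `date`, `ref`) and the sample data are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Each card maps a (hypothetical) field name to True if that field was
# segmented and labelled correctly, False otherwise.
Card = Dict[str, bool]


def field_accuracy(cards: List[Card]) -> float:
    """Fraction of individual fields that are correct, pooled over all cards."""
    total = sum(len(card) for card in cards)
    correct = sum(sum(card.values()) for card in cards)
    return correct / total if total else 0.0


def card_accuracy(cards: List[Card]) -> float:
    """Fraction of cards on which every field is correct."""
    if not cards:
        return 0.0
    return sum(all(card.values()) for card in cards) / len(cards)


# Illustrative data only; these are not results from the paper.
sample = [
    {"title": True, "date": True, "ref": True},
    {"title": True, "date": False, "ref": True},
    {"title": True, "date": True, "ref": True},
    {"title": False, "date": False, "ref": True},
]

print(field_accuracy(sample))  # 9 of 12 fields correct
print(card_accuracy(sample))   # 2 of 4 cards fully correct
```

Because a single wrong field disqualifies a whole card, the per-card figure is never higher than the per-field figure, which is consistent with the gap between the two percentages reported above.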
Similar resources
A web2.0 collaborative cultural heritage archive with recommender system over trace based reasoning
Cultural heritage presents a large quantity of information and attracts many kinds of people. In recent decades, computer technology and the internet have helped bring history into present life. Ancient and historical documents were digitized and exposed online. Therefore, cultural heritage digital libraries and web sites were created, first to enhance document preservation, and second to facilitate ...
Document Icons and Page Thumbnails: Issues in Construction of Document Thumbnails for Page-Image Digital Libraries
Digital libraries are increasingly based on digital page images, but techniques for constructing usable versions of these page images are largely folklore. This paper documents some issues encountered in creating various kinds of renderings of page images for the UpLib digital library system, and suggests approaches for each, based on both problem analysis and user feedback. Several factors imp...
A method of content-based image retrieval for a spinal x-ray image database
The Lister Hill National Center for Biomedical Communications, a research and development division of the National Library of Medicine (NLM), maintains a digital archive of 17,000 cervical and lumbar spine images collected in the second National Health and Nutrition Examination Survey (NHANES II) conducted by the National Center for Health Statistics (NCHS). Classification of the images for the...
Learning Document Image Features With SqueezeNet Convolutional Neural Network
The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...
Ensuring Retrieval Effectiveness in Distributed Digital Libraries
We find that dissemination of collection-wide information (CWI) in a distributed collection of documents is needed to achieve retrieval effectiveness comparable to that of a centralized collection. Complete dissemination is unnecessary. The ... • collection management; • organizing and indexing the materials for storage and retrieval; • user interfaces and human-computer interaction; and • interope...
Publication date: 2003